Keyword and metadata extraction from pre-prints

Authors

  • Emma Tonkin
  • Henk L. Muller
Abstract

In this paper we study how to provide metadata for a pre-print archive. Metadata includes, but is not limited to, title, authors, citations, and keywords, and is used both to present data to the user in a meaningful way and to index and cross-reference the pre-prints. We are particularly interested in studying different methods to obtain metadata for a pre-print. We have developed a system that automatically extracts metadata and allows the user to verify and correct it before it is accepted by the system.


Similar resources

Keyword Extraction from the Web for FOAF Metadata

With the currently growing interest in the Semantic Web, metadata is coming to play an important role in the Web. As one of the forthcoming metadata standards for the Semantic Web, FOAF defines an RDF vocabulary for expressing metadata about people and the relations between people. In this paper we propose a novel keyword extraction method to extract FOAF metadata from the Web. The proposed meth...


Keyword Extraction from the Web for Creation of Person Metadata

As one emerging metadata standard for the Semantic Web, FOAF defines an RDF vocabulary for expressing metadata about people and the relation among them. This paper proposes a novel keyword extraction method to extract FOAF metadata from the Web. The proposed method is based on co-occurrence information of words. Our method extracts relevant keywords depending on the context of a person. Our exp...


Keyword Extraction from the Web for Personal Metadata Annotation

With the currently growing interest in the Semantic Web and Social Networking, personal metadata is coming to play an important role in the Web. This paper proposes a novel keyword extraction method to extract personal metadata from the Web. The proposed method is based on co-occurrence information of words. Our method extracts relevant keywords depending on the context of a person. Our experim...


Automatic Keyword Extraction for Learning Object Repositories

Learning object repositories are digital collections of educational materials, e.g., lectures, notes, presentations, which can be used to support learning. The main purpose of such repositories is to improve the sharing and reusability of the learning objects, which can be defined as “any digital resource that can be reused to support learning” (Wiley, 2000, p. 7). An important asp...


Exploring Multidimensional Continuous Feature Space to Extract Relevant Words

With growing amounts of text data, descriptive metadata becomes more crucial to processing it efficiently. One kind of such metadata is keywords, which we encounter, e.g., in everyday browsing of webpages. Such metadata can be of benefit in various scenarios, such as web search or content-based recommendation. We study the keyword extraction problem from the perspective of vector space and ...




Publication date: 2008